Abstract

Data from two hundred college-level tests were used to compare three 
reliability approximations (two of Saupe and one of Cureton) to KR20. While 
the approximations correlated highly (about .9) with the reliability estimate, 
they tended to be underapproximations. The explanation lies in an apparent 
bias of Lord's approximation to the standard error of measurement. Until 
further investigation is completed, it is suggested that these approximations 
be used only for comparisons among tests of similar average relative difficulty. 

The purpose of this study was to investigate three related approximations 
to a test's reliability. These approximations have been suggested by those 
who developed them (Cureton and others, 1973 and Saupe, 1961) and by others 
(e.g., McMorris, 1972 and Payne, 1974) to be sufficiently accurate to warrant 
their use before, instead of, and/or in addition to computing such estimates 
as the familiar Kuder-Richardson Formula 20 (KR20). 

Each of these approximations is based on Lord's (1957, 1959) approximation 
to the standard error of measurement, 

$$SE_{Meas} = \frac{3\sqrt{n}}{7} \qquad (1)$$

where n is the number of items in a test. If the right-hand side of 
(1) is set equal to $S_X\sqrt{1 - r_{XX'}}$ and solved for $r_{XX'}$, 

$$r_{XX'} = 1 - \frac{9n}{49\,S_X^2} \approx 1 - \frac{.184\,n}{S_X^2} \qquad (2)$$

This result was first noted by Saupe, who also provided an unbiased estimate, 

[Equation (3) is not legible in the source.] 

In both (2) and (3), $r_{XX'}$ is the reliability of the test X, and $S_X^2$ is 
its variance. 
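
As a rough illustration of Equations (1) and (2), the following Python sketch computes Lord's approximation and the reliability value that follows from it. It assumes that Equation (1) has the form 3√n/7 and that Equation (2) is 1 - 9n/(49 S²). The function names `lord_se` and `rs1` and the input values are hypothetical, chosen only to resemble the averages reported in Table 1.

```python
import math


def lord_se(n_items: int) -> float:
    """Lord's approximation to the standard error of measurement (assumed form 3*sqrt(n)/7)."""
    return 3.0 * math.sqrt(n_items) / 7.0


def rs1(n_items: int, variance: float) -> float:
    """Reliability obtained by setting lord_se equal to S*sqrt(1 - r) and solving for r."""
    return 1.0 - lord_se(n_items) ** 2 / variance


# Illustrative values only: a 45-item test with a score standard deviation of 5.4.
n, sd = 45, 5.4
print(round(lord_se(n), 2))      # 2.87
print(round(rs1(n, sd ** 2), 3))  # 0.717
```

Because the approximation depends only on the number of items and the score variance, it can be computed before any item-level analysis is available.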

Much earlier, Jenkins (1946) provided an approximation to the standard 
deviation, 

$$S_X = \frac{\Sigma U(1/6) - \Sigma L(1/6)}{.5N} \qquad (4)$$

where ΣU(1/6) is the sum of the highest one-sixth of the scores; ΣL(1/6) is the 
sum of the lowest one-sixth; and N is the number of examinees. By squaring 
(4) and substituting it for $S_X^2$ in (2), Cureton and others obtained 

$$r_{XX'} = 1 - .043\,n\left[\frac{N}{\Sigma U(1/6) - \Sigma L(1/6)}\right]^2 \qquad (5)$$

Thus there are at least three approximations to the internal consistency 
of a test, namely Equations (2), (3), and (5). In this study they are 
designated RS1, RS2, and RC, respectively. 
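
The tail-sum approach behind Equations (4) and (5) can be sketched as follows. This is a minimal illustration under stated assumptions: the size of each one-sixth tail is obtained by rounding N/6 (the source does not give a rounding rule), the coefficient .043 is taken from the "Coefficient in Equation" column of Table 4, and the names `tail_difference`, `jenkins_sd`, and `rc` as well as the score list are hypothetical.

```python
import statistics


def tail_difference(scores):
    """Sum of the highest sixth of the scores minus the sum of the lowest sixth."""
    ordered = sorted(scores)
    k = max(1, round(len(ordered) / 6))  # tail size; the rounding rule is an assumption
    return sum(ordered[-k:]) - sum(ordered[:k])


def jenkins_sd(scores):
    """Jenkins' approximation to the standard deviation: tail difference / (0.5 * N)."""
    return tail_difference(scores) / (0.5 * len(scores))


def rc(n_items, scores, coefficient=0.043):
    """Reliability approximation of the form 1 - c * n * (N / tail difference)^2.

    The default coefficient is the value listed for RC in Table 4 (an assumption here).
    Raises ZeroDivisionError if all tail scores are identical.
    """
    ratio = len(scores) / tail_difference(scores)
    return 1.0 - coefficient * n_items * ratio ** 2


# Made-up total scores for twelve examinees on a 40-item test.
scores = [12, 15, 17, 18, 20, 21, 22, 23, 25, 26, 28, 31]
print(jenkins_sd(scores), statistics.pstdev(scores))  # approximation vs. computed SD
print(rc(40, scores))
```

Only the scores in the two tails need to be summed, which is what made the approach attractive for hand computation.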

To compare the results of these approximations with KR20, both Pearson 
correlations and the algebraic and absolute differences between each 
approximation and KR20 were computed. Also, Equations (2), (3), and (5) 
were set equal to KR20 and solved for the coefficient of n in each case. 
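
The comparison described above can be sketched in Python. The sketch assumes dichotomously scored items (0 or 1), the population form of the variance, and an approximation of the form 1 - c·n/S², as in Equation (2). The function names and the small response matrix are hypothetical and serve only to show the arithmetic.

```python
def kr20(item_matrix):
    """Kuder-Richardson Formula 20 for rows of 0/1 item scores (one row per examinee)."""
    n_examinees = len(item_matrix)
    n_items = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n_examinees
    variance = sum((t - mean_total) ** 2 for t in totals) / n_examinees
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_examinees  # item difficulty
        sum_pq += p * (1.0 - p)
    return n_items / (n_items - 1) * (1.0 - sum_pq / variance)


def implied_coefficient(kr20_value, n_items, variance):
    """Coefficient c for which 1 - c * n / S^2 equals the given KR20."""
    return (1.0 - kr20_value) * variance / n_items


def differences(approximation, kr20_value):
    """Algebraic and absolute difference between an approximation and KR20."""
    algebraic = approximation - kr20_value
    return algebraic, abs(algebraic)


# Made-up responses: four examinees, three items.
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
r = kr20(responses)
print(r)                                # 0.75
print(implied_coefficient(r, 3, 1.25))  # total-score variance of this matrix is 1.25
```

A negative algebraic difference means that the approximation falls below KR20; for Equation (5), the same idea applies with the squared Jenkins approximation in place of the variance.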

Tests and Computations 

The tests used were 210 classroom tests for which machine-scorable 
answer sheets were submitted to Syracuse University's Test Scoring and 
Evaluation Services for scoring and analysis during a two-month period. 
Two of the tests were subsequently eliminated from the study when it was 
discovered that the instructor had submitted blank answer sheets for absent 
students. The apparent effect of the resulting "scores" of zero on both 
the variances and the KR20s was marked. Another eight tests were eliminated 
from most of the study because they had negative KR20s. Selected descriptive 
statistics for the tests are presented in Table 1. 

The answer sheets were scanned on an OpScan DM100 scanner, and routine 
test analysis statistics, including KR20 and $SE_{Meas}$ based on KR20, were 
computed using a locally developed test analysis program. Further computations 
specific to this study were done with subprograms of the Statistical Package 
for the Social Sciences (Nie and others, 1975). All data processing was 
performed on an IBM 370/155 computer. 

Table 1 
Test Statistics 

                                 200 tests with KR20              208 tests, regardless
                                 greater than zero                of KR20
Statistic                        Mean    SD      Limits           Mean    SD      Limits
Number of items                  44.8    29.8    8 - 140          43.6    29.8    8 - 140
Number of examinees              90.0    104.9   6 - 333          87.1    103.8   5 - 883
Mean                             30.4    20.5    4.0 - 102.4      29.5    20.6    2.1 - 102.4
Standard deviation               5.4     3.4     1.3 - 17.4       5.3     3.5     0.7 - 17.4
KR20                             .663    .209    .025 - .945      .636    .247    -.144 - .945
Standard error of measurement    2.5     0.9     1.0 - 5.0        2.4     0.9     .7 - 5.0

Results 

Correlations among the three approximations and KR20 are presented in 
Table 2. The values above the diagonal in Table 2 are for all 208 cases; 
those below the diagonal are for the two hundred cases with KR20 greater than 
zero. All subsequent data are for these two hundred cases only. The differences 
between the approximations and KR20 are summarized in Table 3, and the 
coefficients computed by setting Equations (2), (3), and (5) equal to KR20 
are presented in Table 4. 

Table 2 
Correlations among KR20 and Three Approximations 

         KR20    RS1     RS2     RC
KR20             .862    .847    .376
RS1      .929            .999    .984
RS2      .925    .999            .981
RC       .916    .982    .980

Table 3 
Summary Statistics for Differences between KR20 and Three Approximations 

Difference        Mean     SD      Minimum   Maximum
RS1-KR20
  Algebraic       -.074    .108    -.692     .103
  Absolute         .081    .103     .001     .692
RS2-KR20
  Algebraic       -.093    .132    -.822     .059
  Absolute         .098    .128     .000     .822
RC-KR20
  Algebraic       -.076    .121    -.687     .131
  Absolute         .088    .113     .000     .687

Table 4 
Coefficients for Three Approximations 

                 Coefficient
Approximation    in Equation    Mean    SD      Minimum   Maximum
RS1              .184           .157    .026    .080      .209
RS2              .200           .169    .028    .088      .228
RC               .043           .037    .007    .019      .051


Discussion 

On the evidence in Table 2, the three approximations would be judged 
to be essentially the same and quite closely related to KR20. That is, 
these approximations are very highly correlated with each other and also 
with KR20. However, scrutiny of Tables 3 and 4 suggests a systematic 
difference between each of these approximations and KR20. The approximations 
tended to yield values that are smaller than the criterion statistic. 

Further, the inaccuracy is systematically related to the relative 
difficulty of the test. All but eleven of the tests had mean item difficulty 
indices (mean total scores divided by number of items) that were greater than 
0.5. The correlations between that variable (p) and the algebraic differences 
were -.65, -.60, and -.57 for RS1, RS2, and RC, respectively. When the variable 
pq = p(1-p) was correlated with these differences, values of .75, .70, and .66 
were obtained. That is, the more p departed from 0.5, the greater the size 
of underapproximation. 
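
The opposite signs of the correlations with p and with pq follow from the shape of pq = p(1-p), which peaks at p = 0.5 and falls as p moves away from it. The sketch below illustrates this for tests that are all easier than p = 0.5, as nearly all tests in the study were. The function names and the mean scores are hypothetical and are not data from the study.

```python
import math


def difficulty(mean_total, n_items):
    """Mean item difficulty p (mean total score / number of items) and pq = p(1 - p)."""
    p = mean_total / n_items
    return p, p * (1.0 - p)


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical mean scores on five 50-item tests, all with p above 0.5.
mean_totals = [28, 32, 36, 40, 44]
p_values, pq_values = zip(*(difficulty(m, 50) for m in mean_totals))
print(pearson(p_values, pq_values))  # negative: pq falls as p rises above 0.5
```

When p and pq are themselves negatively related in this way, a variable that correlates negatively with p will tend to correlate positively with pq.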

Correlations between p and the coefficients that would have set Equations 
(2), (3), and (5) equal to KR20 were -.85, -.84, and -.76, respectively. For 
pq, these correlations rose to .89, .89, and .80. The joint distribution of 
p and the coefficient for RS1 is presented in Table 5. These strong relationships, 
however, did not carry through to the relationships between relative difficulty 
and KR20. The correlation between KR20 and p was .01; that between pq and 
KR20 was .04. 

The apparent source of the inaccuracies lies in Equation (1). There were 
strong and consistent relationships between both p and pq and all measures of 
the inaccuracies from Equation (1). These relationships are currently being 
investigated separately. 

At this point, however, it appears safe to conclude that there are 
systematic inaccuracies in Lord's approximation and in approximations 
based on it. The reliability approximations are nonetheless useful for comparing 
two tests with approximately the same relative difficulty, for setting a 
probable lower limit to KR20, and for rough approximations during the test 
development process. 

As has already been stated, investigation is continuing into the nature, 
source and generality of these inaccuracies. 